

Search for: All records

Creators/Authors contains: "Ware, Doreen"


  1. Abstract Rice is a vital staple crop, sustaining over half of the global population, and is a key model for genetic research. To support the growing need for comprehensive and accessible rice genomic data, GrameneOryza (https://oryza.gramene.org) was developed as an online resource adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) principles of data management. It distinguishes itself through its comprehensive multispecies focus, encompassing a wide variety of Oryza genomes and related species, and its integration with FAIR principles to ensure data accessibility and usability. It offers a community curated selection of high-quality Oryza genomes, genetic variation, gene function, and trait data. The latest release, version 8, includes 28 Oryza genomes, covering wild rice and domesticated cultivars. These genomes, along with Leersia perrieri and seven additional outgroup species, form the basis for 38 K protein-coding gene family trees, essential for identifying orthologs, paralogs, and developing pan-gene sets. GrameneOryza’s genetic variation data features 66 million single-nucleotide variants (SNVs) anchored to the Os-Nipponbare-Reference-IRGSP-1.0 genome, derived from various studies, including the Rice Genome 3 K (RG3K) project. The RG3K sequence reads were also mapped to seven additional platinum-quality Asian rice genomes, resulting in 19 million SNVs for each genome, significantly expanding the coverage of genetic variation beyond the Nipponbare reference. Of the 66 million SNVs on IRGSP-1.0, 27 million acquired standardized reference SNP cluster identifiers (rsIDs) from the European Variation Archive release v5. Additionally, 1200 distinct phenotypes provide a comprehensive overview of quantitative trait loci (QTL) features. The newly introduced Oryza CLIMtools portal offers insights into environmental impacts on genome adaptation. 
The platform’s integrated search interface, along with a BLAST server and curation tools, facilitates user access to genomic, phylogenetic, gene function, and QTL data, supporting broad research applications. Database URL: https://oryza.gramene.org 
    Free, publicly-accessible full text available January 1, 2026
  2. Abstract Modern maize (Zea mays ssp. mays) was domesticated from Teosinte parviglumis (Zea mays ssp. parviglumis), with subsequent introgressions from Teosinte mexicana (Zea mays ssp. mexicana), yielding increased kernel row number, loss of the hard fruit case and dissociation from the cob upon maturity, as well as fewer tillers. Molecular approaches have identified transcription factors controlling these traits, yet revealed that a complex regulatory network is at play. MaizeCODE deploys ENCODE strategies to catalog regulatory regions in the maize genome, generating histone modification and transcription factor ChIP-seq in parallel with transcriptomics datasets in 5 tissues of 3 inbred lines which span the phenotypic diversity of maize, as well as the teosinte inbred TIL11. Transcriptomic analysis reveals that pollen grains share features with endosperm, and express dozens of “proto-miRNAs”, potential vestiges of gene drive and hybrid incompatibility. Integrated analysis with chromatin modifications results in the identification of a comprehensive set of regulatory regions in each tissue of each inbred, and notably of distal enhancers expressing non-coding enhancer RNAs bi-directionally, reminiscent of “super enhancers” in animal genomes. Furthermore, the morphological traits selected during domestication are recapitulated, both in gene expression and within regulatory regions containing enhancer RNAs, while highlighting the conflict between enhancer activity and silencing of the neighboring transposable elements. 
  3. The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, “telomere-to-telomere” genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT “Duplex” sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used “Pore-C” chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the UL reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and provides a multirun single-instrument solution for the reconstruction of complete genomes. 
    Free, publicly-accessible full text available November 1, 2025
  4. In plants, vegetative and reproductive development are associated with agronomically important traits that contribute to grain yield and biomass. Zinc finger homeodomain (ZF-HD) transcription factors (TFs) constitute a relatively small gene family that has been studied in several model plants, including Arabidopsis thaliana L. and Oryza sativa L. The ZF-HD family members play important roles in plant growth and development, but their contribution to the regulation of plant architecture remains largely unknown due to their functional redundancy. To understand the gene regulatory network controlled by ZF-HD TFs, we analyzed multiple loss-of-function mutants of ZF-HD TFs in Arabidopsis that exhibited morphological abnormalities in branching and flowering architecture. We found that ZF-HD TFs, especially HB34, negatively regulate the expression of miR157 and positively regulate SQUAMOSA PROMOTER BINDING–LIKE 10 (SPL10), a target of miR157. Genome-wide chromatin immunoprecipitation sequencing (ChIP-Seq) analysis revealed that miR157D and SPL10 are direct targets of HB34, creating a feedforward loop that constitutes a robust miRNA regulatory module. Network motif analysis reveals overrepresented coherent type IV feedforward motifs in the amiR zf-HD and hbq mutant background. This finding indicates that miRNA-mediated ZF-HD feedforward modules modify branching and inflorescence architecture in Arabidopsis. Taken together, these findings reveal a guiding role of ZF-HD TFs in the regulatory network module and demonstrate its role in plant architecture in Arabidopsis. 
  5. Abstract Background: Genome-wide association studies (GWAS) aim to correlate phenotypic changes with genotypic variation. Upon transcription, single nucleotide variants (SNVs) may alter mRNA structure, with potential impacts on transcript stability, macromolecular interactions, and translation. However, plant genomes have not been assessed for the presence of these structure-altering polymorphisms or “riboSNitches.” Results: We experimentally demonstrate the presence of riboSNitches in transcripts of two Arabidopsis genes, ZINC RIBBON 3 (ZR3) and COTTON GOLGI-RELATED 3 (CGR3), which are associated with continentality and temperature variation in the natural environment. These riboSNitches are also associated with differences in the abundance of their respective transcripts, implying a role in regulating the gene's expression in adaptation to local climate conditions. We then computationally predict riboSNitches transcriptome-wide in mRNAs of 879 naturally inbred Arabidopsis accessions. We characterize correlations between SNPs/riboSNitches in these accessions and 434 climate descriptors of their local environments, suggesting a role of these variants in local adaptation. We integrate this information in CLIMtools V2.0 and provide a new web resource, T-CLIM, that reveals associations between transcript abundance variation and local environmental variation. Conclusion: We functionally validate two plant riboSNitches and, for the first time, demonstrate riboSNitch conditionality dependent on temperature, coining the term “conditional riboSNitch.” We provide the first pan-genome-wide prediction of riboSNitches in plants. We expand our previous CLIMtools web resource with riboSNitch information and with 1868 additional Arabidopsis genomes and 269 additional climate conditions, which will greatly facilitate in silico studies of natural genetic variation, its phenotypic consequences, and its role in local adaptation. 
  6. Whiteman, N (Ed.)
    Abstract The genome sequence of the diploid and highly homozygous Vitis vinifera genotype PN40024 serves as the reference for many grapevine studies. Despite several improvements to the PN40024 genome assembly, its current version PN12X.v2 is quite fragmented and only represents the haploid state of the genome with mixed haplotypes. In fact, being nearly homozygous, this genome contains several heterozygous regions that are yet to be resolved. Taking the opportunity of improvements that long-read sequencing technologies offer to fully discriminate haplotype sequences, an improved version of the reference, called PN40024.v4, was generated. Through incorporating long genomic sequencing reads to the assembly, the continuity of the 12X.v2 scaffolds was highly increased with a total number decreasing from 2,059 to 640 and a reduction in N bases of 88%. Additionally, the full alternative haplotype sequence was built for the first time, the chromosome anchoring was improved and the number of unplaced scaffolds was reduced by half. To obtain a high-quality gene annotation that outperforms previous versions, a liftover approach was complemented with an optimized annotation workflow for Vitis. Integration of the gene reference catalogue and its manual curation have also assisted in improving the annotation, while defining the most reliable estimation of 35,230 genes to date. Finally, we demonstrated that PN40024 resulted from 9 selfings of cv. “Helfensteiner” (cross of cv. “Pinot noir” and “Schiava grossa”) instead of a single “Pinot noir”. These advances will help maintain the PN40024 genome as a gold-standard reference, also contributing toward the eventual elaboration of the grapevine pangenome. 
  7. Abstract We review how a data infrastructure for the Plant Cell Atlas might be built using existing infrastructure and platforms. The Human Cell Atlas has developed an extensive infrastructure for human and mouse single cell data, while the European Bioinformatics Institute has developed a Single Cell Expression Atlas that currently houses several plant data sets. We discuss issues related to appropriate ontologies for describing a plant single cell experiment. We imagine how such an infrastructure will enable biologists and data scientists to glean new insights into plant biology in the coming decades, as long as such data are made accessible to the community in an open manner. 